Automatic Speech Segmentation with HMM
ABSTRACT: In this paper we review aspects of the automatic speech segmentation system that we have utilised in conjunction with our speech synthesis research. The speech segmentation system is based on a hidden Markov model phone recogniser that uses training strategies optimised for the segmentation task. Our research analyses the various aspects of the phone recogniser’s design and identifies the distinctions between the parameter estimation paradigms for speech segmentation and for recognition. We also look at the limitations of HMM-based segmentation and at techniques for overcoming them. The system evaluation demonstrates that our system provides high-reliability speech segmentation, comparable in performance to other state-of-the-art systems.
Similar Resources
Automatic Segmentation Combining and Spectral Boundary
Currently, AT&T Labs’ Natural Voices multilingual TTS system produces high-quality synthetic speech with a large-scale speech corpus [1]. In the development of such systems, automatic segmentation constitutes a major component technology. The prevalent approach for automatic segmentation in speech synthesis is Hidden Markov Model (HMM) based. Even though an HMM-based approach is the most automat...
Automatic Speech Segmentation Based on HMM
This contribution deals with the problem of automatic phoneme segmentation using HMMs. Automating the speech segmentation task is important for applications in which a large amount of data must be processed, so manual segmentation is out of the question. In this paper we focus on automatic segmentation of recordings that will be used to create a triphone synthesis unit database. For speech...
HMM-based automatic visual speech segmentation using facial data
We describe automatic visual speech segmentation using facial data captured by a stereo-vision technique. The segmentation is performed using an HMM-based forced alignment mechanism widely used in automatic speech recognition. The idea is based on the assumption that using visual speech data alone for the training might capture the uniqueness in the facial component of speech articulation, asyn...
A study of HMM-based automatic segmentations for Thai continuous speech recognition system
Speech segmentation has been widely used in many speech applications. In speech synthesis, the quality of the produced speech depends on the accuracy of the labeled acoustic inventory. In speech recognition, utterances segmented according to the labels are usually used as a starting point for training speech models. The segmentation is often encoded manually, which is a time-consuming process and has ...
A Sphinx Based Speech-music Segmentation Front-end for Improving the Performance of an Automatic Speech Recognition System in Turkish
In this study, a system that segments an audio signal into speech and music by using posterior probability based features is proposed and implemented in Sphinx. Unlike earlier efforts that use Multi-Layer Perceptrons (MLP), this system uses Hidden Markov Model based acoustic models that are trained in Sphinx for posterior probability calculations. Acoustic Models are trained with the HMM-state...
Publication date: 2002